278 ◾ Bioinformatics
be the right moment to remove them using the “cutadapt” plugin with its “trim-single” or
“trim-paired” method for single-end or paired-end reads, respectively.
qiime cutadapt trim-single \
--i-demultiplexed-sequences demux.qza \
--p-front GTGCCAGCMGCCGCGGTAA \
--p-error-rate 0 \
--o-trimmed-sequences trimmed-demux.qza \
--verbose
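For paired-end reads, the equivalent step might look like the following sketch. It reuses the forward primer from the command above; the reverse primer shown (806R) and the file names are assumptions for illustration and must be replaced with the ones used in your own library preparation. The small helper "run_if_available" is hypothetical, not part of QIIME2: it prints the command and executes it only when QIIME2 and the input artifact are present.

```shell
#!/bin/sh
# Sketch: paired-end counterpart of "trim-single".
# Primers and file names are assumptions; adjust them to your data.
FWD_PRIMER="GTGCCAGCMGCCGCGGTAA"    # forward primer, as in the command above
REV_PRIMER="GGACTACHVGGGTWTCTAAT"   # assumed reverse primer (806R)

# Print a command; run it only if QIIME2 and the input artifact exist.
run_if_available() {
  input="$1"; shift
  echo "$*"
  if command -v qiime >/dev/null 2>&1 && [ -f "$input" ]; then
    "$@"
  fi
}

# --p-front-f / --p-front-r give the 5' primers of the forward and reverse reads.
run_if_available demux.qza \
  qiime cutadapt trim-paired \
    --i-demultiplexed-sequences demux.qza \
    --p-front-f "$FWD_PRIMER" \
    --p-front-r "$REV_PRIMER" \
    --p-error-rate 0 \
    --o-trimmed-sequences trimmed-demux.qza \
    --verbose
```

As in the single-end command, an error rate of 0 requires an exact primer match; a small non-zero value tolerates mismatches in the primer region.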
Having assessed our yoga data with “demux summarize”, we now have a clear view of its
quality. Since the data consists of paired-end reads, we will carry out the quality
control later, as part of clustering and denoising.
7.3.4.2 Clustering and Denoising
After the above preprocessing, the next step in the data analysis is to create features by
either clustering or denoising, as discussed above. QIIME2 supports de novo, closed-reference,
and open-reference clustering through the VSEARCH plugin, and denoising with DADA2
and deblur. Either approach produces a feature table and representative sequences.
Denoising (with DADA2 or deblur) attempts to remove the noise generated by sequencing
errors. The features generated by DADA2 and deblur are also called amplicon sequence
variants (ASVs). Whether you continue with clustering or with denoising is your choice.
Both techniques were discussed above in detail. In the following, we will show you how
to perform clustering and denoising with QIIME2.
7.3.4.2.1 Clustering
If you plan to cluster reads into OTUs without denoising, QIIME2 provides the “q2-vsearch”
plugin to do just that. This plugin has methods for the three types of clustering: de novo,
closed-reference, and open-reference. The “q2-vsearch” plugin can also perform quality
control; before running clustering, you may therefore need to do some preprocessing of
the data. Paired-end reads must be merged before processing. In the following, we will
walk you through the steps of clustering up to the point of generating feature tables and
OTU representative sequences.
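As a rough orientation, the core of a de novo clustering run can be sketched as below. It assumes that the reads have already been merged into an artifact named “demux-yoga-merged.qza”; all output file names and the 0.97 identity threshold are placeholders chosen for illustration, and any quality filtering between merging and dereplication is omitted. The helper "run_if_available" is hypothetical: it prints each command and executes it only when QIIME2 and the input artifact are present. The closed-reference and open-reference methods (“cluster-features-closed-reference” and “cluster-features-open-reference”) additionally require a reference sequence artifact.

```shell
#!/bin/sh
# Sketch of de novo OTU clustering on already merged reads.
# File names and the identity threshold are assumptions.
PERC_IDENTITY=0.97

# Print a command; run it only if QIIME2 and the input artifact exist.
run_if_available() {
  input="$1"; shift
  echo "$*"
  if command -v qiime >/dev/null 2>&1 && [ -f "$input" ]; then
    "$@"
  fi
}

# 1. Collapse identical reads into a feature table and unique sequences.
run_if_available demux-yoga-merged.qza \
  qiime vsearch dereplicate-sequences \
    --i-sequences demux-yoga-merged.qza \
    --o-dereplicated-table table.qza \
    --o-dereplicated-sequences rep-seqs.qza

# 2. Cluster the unique sequences into OTUs at the chosen identity.
run_if_available table.qza \
  qiime vsearch cluster-features-de-novo \
    --i-table table.qza \
    --i-sequences rep-seqs.qza \
    --p-perc-identity "$PERC_IDENTITY" \
    --o-clustered-table table-dn-97.qza \
    --o-clustered-sequences rep-seqs-dn-97.qza
```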
7.3.4.2.1.1 Merging Paired-End Reads
If the data is paired-end reads, the forward and reverse reads must be merged before
clustering. Merging is done with the “join-pairs” method of the “q2-vsearch” plugin.
The artifact “demux-yoga.qza” of our example data is in the “inputs” directory. Since the
reads are paired-end, we must merge them before clustering. The following script takes the
“demux-yoga.qza” artifact as input, joins the forward and reverse reads, and creates a
new artifact for the merged reads, “demux-yoga-merged.qza”:
qiime vsearch join-pairs \
--i-demultiplexed-seqs inputs/demux-yoga.qza \